Common Kubernetes Deployment Issues and How to Fix Them

This document covers the most common Kubernetes deployment problems, their likely causes, and the steps to fix them.

1. ImagePullBackOff / ErrImagePull

Problem:

The pod cannot pull the container image.

Possible Causes:

  • Incorrect image name or tag
  • Image is hosted in a private registry without authentication
  • Rate limiting on public registries (e.g., Docker Hub)

How to Fix:

  • Check the image name and tag using kubectl describe pod <pod-name>
  • If using a private registry, create and apply an imagePullSecret:
kubectl create secret docker-registry myregistrykey \
--docker-username=<user> \
--docker-password=<password> \
--docker-server=<registry>

Then reference the secret in the pod spec (for a Deployment, under spec.template.spec):

imagePullSecrets:
- name: myregistrykey
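
To confirm which image reference the pod actually requests and why the pull fails, the checks above could be scripted roughly as follows. This is a minimal sketch: the function name show_pull_failures is hypothetical, and it assumes kubectl is already configured for the right cluster and namespace.

```shell
# Hypothetical helper: print the images a pod requests and its recent events.
# Assumes kubectl is configured for the target cluster and namespace.
show_pull_failures() {
  pod="$1"
  # Image references exactly as written in the pod spec (typos show up here).
  kubectl get pod "$pod" -o jsonpath='{.spec.containers[*].image}'
  echo
  # Events for this pod; pull errors usually appear with reason "Failed" or "BackOff".
  kubectl get events --field-selector "involvedObject.name=$pod"
}

# Usage (not run here): show_pull_failures <pod-name>
```

An authentication error in the event message points at the imagePullSecret, while a "not found" message usually points at the image name or tag.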

2. CrashLoopBackOff

Problem:

The container is repeatedly crashing and restarting.

Possible Causes:

  • The application inside the container is exiting unexpectedly
  • Invalid configuration or environment variables
  • Failing liveness or startup probes (a failing readiness probe only marks the pod unready; it does not restart the container)

How to Fix:

  • Retrieve logs with:
    kubectl logs <pod-name> --previous
  • Verify entrypoint, startup scripts, and environment variables
  • Use initContainers if dependencies need to be initialized first
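
When the logs are empty or inconclusive, the exit code of the previous container run often narrows down the cause. The sketch below is illustrative only: explain_exit_code and last_exit_code are hypothetical helper names, and the mapping covers common conventions rather than every application.

```shell
# Hypothetical helper: map a container exit code to a likely cause.
# Covers common conventions only; applications may define their own codes.
explain_exit_code() {
  case "$1" in
    0)   echo "exited normally; the main process ended, so check the command or entrypoint" ;;
    137) echo "killed with SIGKILL (128+9); often an out-of-memory kill, check memory limits" ;;
    143) echo "terminated with SIGTERM (128+15); possibly a failed liveness probe or shutdown" ;;
    *)   echo "application error (exit code $1); inspect the previous container logs" ;;
  esac
}

# Hypothetical helper: read the last exit code of the first container in a pod.
last_exit_code() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
}

# Usage (not run here): explain_exit_code "$(last_exit_code <pod-name>)"
```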

3. Pending Pods

Problem:

Pods remain in the Pending state and are not scheduled.

Possible Causes:

  • Requested resources (CPU, memory, GPU) exceed what’s available
  • Taints on nodes prevent scheduling
  • Affinity or anti-affinity rules cannot be satisfied

How to Fix:

  • Describe the pod:
    kubectl describe pod <pod-name> 
  • Adjust resource requests/limits or scale your cluster
  • Add tolerations to your pod spec if needed:
    tolerations:
    - key: "example"
      operator: "Exists"
      effect: "NoSchedule"
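
The describe output can be long, so it may help to filter for the scheduler's own explanation. A minimal sketch, assuming a hypothetical helper named why_pending and a kubectl context already set to the pod's namespace:

```shell
# Hypothetical helper: show only the scheduler's failure events for one pod.
# The event message typically states how many nodes were rejected and why
# (insufficient CPU or memory, untolerated taints, unmatched affinity rules).
why_pending() {
  kubectl get events \
    --field-selector "involvedObject.name=$1,reason=FailedScheduling"
}

# Usage (not run here): why_pending <pod-name>
```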

4. Service Not Reachable

Problem:

A pod or other client cannot connect to a service using its DNS name.

Possible Causes:

  • Incorrect service selectors
  • No matching pods (endpoints list is empty)
  • DNS resolution issues

How to Fix:

  • Check the service:
    kubectl describe svc <svc-name> 
  • Make sure the pods have labels that match the service selector
  • Verify the service has active endpoints:
    kubectl get endpoints <svc-name> 
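
If the selector and endpoints look correct, DNS resolution can be tested from inside a pod. The following is a sketch under stated assumptions: service_fqdn and resolve_from_pod are hypothetical helpers, the cluster domain is assumed to be the default cluster.local, and the pod's image is assumed to include nslookup.

```shell
# Hypothetical helper: build the fully qualified DNS name of a service.
# Assumes the default cluster domain "cluster.local"; pass a third argument to override.
service_fqdn() {
  echo "$1.$2.svc.${3:-cluster.local}"
}

# Hypothetical helper: resolve the service from inside a running pod.
# Assumes the pod's image ships nslookup (busybox-based images usually do).
resolve_from_pod() {
  kubectl exec "$1" -- nslookup "$(service_fqdn "$2" "$3")"
}

# Usage (not run here): resolve_from_pod <pod-name> <svc-name> <namespace>
```

Using the fully qualified name rules out mistakes caused by calling a service in another namespace by its short name.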

5. ConfigMap or Secret Not Found

Problem:

Pod startup fails because a referenced ConfigMap or Secret is missing.

Possible Causes:

  • The ConfigMap or Secret doesn't exist
  • Typo in the resource name
  • Resource exists in a different namespace

How to Fix:

  • Ensure the ConfigMap or Secret is created in the correct namespace
  • Validate the names and keys used in the pod spec
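
A quick way to spot a namespace mismatch is to search for the resource name across all namespaces. This is only a sketch: find_config is a hypothetical helper, and it assumes your account is allowed to list ConfigMaps or Secrets cluster-wide.

```shell
# Hypothetical helper: list every namespace that holds a ConfigMap or Secret
# with the given name. The first argument is "configmap" or "secret".
# Assumes permission to list the resource across all namespaces.
find_config() {
  kind="$1"
  name="$2"
  kubectl get "$kind" --all-namespaces \
    --field-selector "metadata.name=$name"
}

# Usage (not run here): find_config configmap <configmap-name>
```

An empty result suggests the resource does not exist or the name is misspelled; a result in another namespace suggests it was created in the wrong place.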

6. Liveness or Readiness Probe Failures

Problem:

Probes fail, causing the pod to restart or remain unready.

Possible Causes:

  • Application takes time to become ready
  • Incorrect path or port specified
  • Probes are too aggressive

How to Fix:

  • Add initialDelaySeconds and adjust probe intervals:
    initialDelaySeconds: 10
    periodSeconds: 5
  • Validate the health endpoint inside the container is working
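
One way to validate the health endpoint is to request it from inside the container, which takes the network path and the probe configuration out of the picture. A minimal sketch, assuming a hypothetical helper named probe_endpoint and an image that includes wget; the port and path in the usage line are placeholders, not values from this document.

```shell
# Hypothetical helper: request the health endpoint from inside the container.
# Assumes the image includes wget; substitute curl or another client if it does not.
probe_endpoint() {
  pod="$1"
  port="$2"
  path="$3"
  kubectl exec "$pod" -- wget -qO- "http://localhost:${port}${path}"
}

# Usage (not run here, placeholder values): probe_endpoint <pod-name> 8080 /healthz
```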

7. Deployment Not Updating

Problem:

Changes to the deployment do not result in new pods being created.

Possible Causes:

  • No actual changes in the pod template
  • Deployment rollout is paused

How to Fix:

  • Make sure the pod template (spec.template) actually changes, e.g., a new image tag or environment variable
  • Force a rollout if needed:
kubectl rollout restart deployment <deployment-name> 
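
For the paused-rollout case, the deployment has to be resumed before any change takes effect. The sketch below shows one possible way to check and resume; resume_if_paused is a hypothetical helper name.

```shell
# Hypothetical helper: resume a deployment if its rollout is paused,
# then wait for the rollout to finish.
resume_if_paused() {
  name="$1"
  # .spec.paused prints "true" while the rollout is paused and is empty otherwise.
  paused="$(kubectl get deployment "$name" -o jsonpath='{.spec.paused}')"
  if [ "$paused" = "true" ]; then
    kubectl rollout resume "deployment/$name"
  fi
  kubectl rollout status "deployment/$name"
}

# Usage (not run here): resume_if_paused <deployment-name>
```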

8. PVC Pending

Problem:

PersistentVolumeClaim (PVC) remains in Pending state.

Possible Causes:

  • No available PersistentVolume (PV)
  • StorageClass does not exist or does not match
  • Mismatch in requested size or access mode

How to Fix:

  • Check available PVs and StorageClasses:
    kubectl get pv
    kubectl get storageclass
  • Ensure a PV with matching size, access mode, and storage class exists
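
To compare what the claim asks for with what the cluster offers, it can help to print the claim's storage class, size, and access modes on one line. A minimal sketch with a hypothetical helper name, pvc_request:

```shell
# Hypothetical helper: print a PVC's requested storage class, size, and
# access modes, separated by tabs, for comparison with the available PVs.
pvc_request() {
  kubectl get pvc "$1" -o \
    jsonpath='{.spec.storageClassName}{"\t"}{.spec.resources.requests.storage}{"\t"}{.spec.accessModes}{"\n"}'
}

# Usage (not run here): pvc_request <pvc-name>
```

The events at the end of kubectl describe pvc <pvc-name> usually state which of these values could not be satisfied.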

9. Node Pressure (Disk/CPU/Memory)

Problem:

Pods are evicted or not scheduled due to node resource constraints.

Possible Causes:

  • Node is under resource pressure
  • Kubelet evicts pods when thresholds are breached

How to Fix:

  • Inspect node status:
    kubectl describe node <node-name> 
  • Free up resources or reschedule workloads
  • Tune kubelet eviction settings if necessary
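
The node description contains a conditions table; the sketch below extracts only the pressure conditions that are currently active. The helper names node_conditions and active_pressure are hypothetical.

```shell
# Hypothetical helper: list each node condition as "Type=Status", one per line.
node_conditions() {
  kubectl get node "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
}

# Hypothetical helper: keep only pressure conditions that are currently True
# (for example MemoryPressure, DiskPressure, or PIDPressure).
active_pressure() {
  grep 'Pressure=True$'
}

# Usage (not run here): node_conditions <node-name> | active_pressure
```

No output means the node reports no active pressure condition, so evictions are more likely explained by something else, such as a pod exceeding its own limits.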

10. RBAC Permission Denied

Problem:

A service account or user is denied permission to perform an action.

Possible Causes:

  • Missing Role or ClusterRole
  • No RoleBinding or ClusterRoleBinding assigned

How to Fix:

  • Test access using:
    kubectl auth can-i get pods --as=system:serviceaccount:<namespace>:<service-account> 
  • Apply the appropriate Role or ClusterRole and bind it to the service account
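
As an illustration of the last step, a namespaced Role and RoleBinding can be created imperatively. This is a sketch under assumptions: grant_pod_reader and the role name pod-reader are hypothetical, and the verbs shown grant read-only access to pods only.

```shell
# Hypothetical helper: grant a service account read-only access to pods
# in its own namespace. The role name "pod-reader" is an illustrative choice.
grant_pod_reader() {
  ns="$1"
  sa="$2"
  # Role: allows get, list, and watch on pods within the namespace.
  kubectl create role pod-reader --verb=get,list,watch --resource=pods -n "$ns"
  # RoleBinding: attaches the role to the service account.
  kubectl create rolebinding "${sa}-pod-reader" \
    --role=pod-reader --serviceaccount="${ns}:${sa}" -n "$ns"
}

# Usage (not run here): grant_pod_reader <namespace> <service-account>
```

Re-running the kubectl auth can-i check afterwards should confirm whether the binding took effect.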